Background. Whole-genome sequencing (WGS) enables comprehensive detection of genetic variants, but benchmarking its results is limited by incomplete reference datasets. Trio-based analysis, which leverages Mendelian inheritance, provides an alternative strategy for validating sequencing results and estimating error rates, particularly in regulatory genomic regions.

Methods. We extended the nf-core/sarek WGS pipeline with a module that collects parental and offspring allele information, extracts regulatory genomic regions, and computes Mendelian-consistency scores. The algorithm processes variant calls from parents and children to distinguish expected from anomalous inheritance patterns. The module was implemented in C++ and integrated into the Nextflow workflow, supporting automated analysis of trio datasets.

Results. The method was validated on two real trio datasets, comparing DeepVariant and HaplotypeCaller as variant callers. For both trios, DeepVariant achieved higher sensitivity and precision, and the differences were statistically significant according to 95% confidence intervals. These results demonstrate that the proposed approach enables effective benchmarking of variant-calling performance on non-benchmark datasets.

Conclusions. The developed method provides a practical and scalable framework for quantitatively assessing WGS results from trio data. By incorporating Mendelian-inheritance validation into existing pipelines, researchers can estimate sequencing error rates, compare variant callers, and optimize workflows in regulatory genomic regions. Our findings confirm the superior performance of DeepVariant over HaplotypeCaller for the studied datasets.
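As an illustration of the Mendelian-consistency idea described in the Methods, the following is a minimal sketch, not the C++ module itself. It assumes diploid autosomal genotypes encoded as pairs of allele indices (0 for the reference allele, 1 and above for alternates, `None` for a missing call) and defines the score as the fraction of fully called sites at which the child's genotype can be formed from one paternal and one maternal allele. The function names and the exact score definition are assumptions made for this example; Python is used here only for brevity.

```python
from itertools import product


def is_mendelian_consistent(father, mother, child):
    """Return True if the child's unordered genotype can be formed by taking
    one allele from the father and one from the mother (diploid, autosomal)."""
    child_sorted = tuple(sorted(child))
    return any(
        tuple(sorted((f, m))) == child_sorted
        for f, m in product(father, mother)
    )


def consistency_score(trio_genotypes):
    """Fraction of fully called sites that obey Mendelian inheritance.

    trio_genotypes: iterable of (father, mother, child) genotype tuples.
    Sites with any missing allele (None) are skipped.
    """
    checked = 0
    consistent = 0
    for father, mother, child in trio_genotypes:
        if any(a is None for gt in (father, mother, child) for a in gt):
            continue  # a missing call cannot be evaluated
        checked += 1
        if is_mendelian_consistent(father, mother, child):
            consistent += 1
    return consistent / checked if checked else float("nan")


# Hypothetical toy input: four sites for one trio.
sites = [
    ((0, 1), (0, 0), (0, 1)),        # consistent: child inherits 1 from father
    ((0, 0), (0, 0), (0, 1)),        # violation: allele 1 absent in both parents
    ((1, 1), (0, 1), (1, 1)),        # consistent: 1 from each parent
    ((0, 1), (None, None), (0, 1)),  # skipped: mother not called
]
score = consistency_score(sites)
```

In this sketch a violation can reflect either a calling error or a true de novo mutation. Because de novo events are rare, the violation rate is commonly treated as an approximate upper bound on the genotype error rate at the evaluated sites.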